Abstract

Unmanaged Internet Protocol (UIP) is a fully self-organizing
network-layer protocol that implements scalable identity-based
routing. In contrast with address-based routing protocols, which
depend for scalability on centralized hierarchical address
management, UIP nodes use a flat namespace of cryptographic node
identifiers. Node identities can be created locally on demand and
remain stable across network changes. Unlike location-independent
name services, the UIP routing protocol can stitch together many
conventional address-based networks with disjoint or discontinuous
address domains, providing connectivity between any pair of
participating nodes even when no underlying network provides direct
connectivity. The UIP routing protocol works on networks with
arbitrary topologies and global traffic patterns, and requires only
O(log N) storage per node for routing state, enabling even small,
ubiquitous edge devices to act as ad-hoc self-configuring routers.
The protocol rapidly recovers from network partitions, bringing
every node up-to-date in a multicast-based chain reaction of
O(log N) depth. Simulation results indicate that UIP finds routes
that are on average within 2× the length of the best possible route.

This technical report describes a work in progress and does not
contain complete, final, or polished results. This research was
conducted as part of the IRIS project (http://project-iris.net/),
supported by the National Science Foundation under Cooperative
Agreement No.

1 Introduction

Routing protocols for flat node namespaces are traditionally
limited in scalability by per-node storage or per-node
routing traffic overheads that increase at least linearly with 
the size of the network. The scalability of today's In- 
ternet to millions and soon billions of nodes is currently 
possible only through address-based routing, in which 
topology information is embedded into structured node 
addresses. Classless Inter-Domain Routing (CIDR) [28] 
enables IP routers to store detailed routing information 
only for nodes and subnets within a local administrative 
domain, aggregating all routing information about more 
distant networks into larger address blocks. 

The scalability of CIDR depends on careful assign- 
ment of node addresses to mirror the structure of the net- 
work, however. Manual IP address assignment is tedious 
and technical, while dynamic assignment [7] makes ad- 
dresses unstable over time and cripples nodes in edge net- 
works that become temporarily disconnected from assign- 
ment services [4]. Organizational resistance plagues IP 
address renumbering efforts [2], and host mobility and 
multihoming violate the hierarchical CIDR model, lead- 
ing to extensions demanding additional care and feed- 
ing [27, 11]. Firewalls and network address translators 
(NATs) create discontinuous address domains [31], mak- 
ing remote access and peer-to-peer communication diffi- 
cult [13]. Finally, new networking technologies may re- 
quire fundamentally different and incompatible address 
architectures [33, 16]. These factors suggest that no 
single address-based routing protocol, let alone a single 
centrally-administered routing domain, may ever provide 
connectivity between every pair of nodes in the world that 
want to communicate. 

UIP is a scalable identity-based internetworking protocol,
designed to fill the connectivity gaps left by address-based
protocols such as IP. UIP stitches together multiple address-based
layer 2 and layer 3 networks into one large "layer 3.5"
internetwork, in which nodes use topology-free identifiers in a
flat namespace instead of hierarchical addresses. All UIP nodes act
as self-configuring routers, enabling directly- or
indirectly-connected UIP nodes to communicate via paths that may
cross any number of address domains.

Figure 1: Today's Internetworking Challenges



1.1 A Motivating Example 

Joe Average User has a Bluetooth-enabled phone, a laptop with both
Bluetooth and 802.11 support, and several other 802.11-only devices
on his home network, as illustrated in Figure 1. He just moved in,
however, and does not yet have a working Internet connection. With
UIP running on each of these devices, Joe's Bluetooth phone can
communicate through his laptop with all of his other 802.11
devices. The laptop acts as a self-configuring router for all of
the devices reachable on his home network, without Joe having to
assign any addresses manually.

Joe eventually obtains an Internet connection and de- 
ploys a home NAT, which turns out to be located behind a 
larger NAT deployed by his (cheap) ISP. When his Inter- 
net connection becomes active, Joe's home devices au- 
tomatically merge into the global UIP network and he 
can access them through any other Internet-connected 
UIP host. While at his friend Jim's home, for example, 
Joe's Bluetooth phone automatically discovers and con- 
nects with Jim's PC, a well-connected Internet node that 
also runs UIP. Joe can then use his phone to control and 
remotely access the devices in his home, exactly as he
would if he were at home. Again, no configuration is required;
Joe's home devices and Jim's PC automatically
conspire with other UIP nodes on the Internet to build the 
necessary forwarding paths. 

Joe's company runs an IPv6 network behind a firewall with a highly
restrictive forwarding policy, but the firewall permits UIP traffic
to and from specific internal hosts whose installed software the
company's network administrator trusts. Joe is fortunate enough to
have such a trusted host at work, which likewise merges into the
global UIP network. Joe can now access his home devices from work
and his work PC from home, and he can access any of them from his
Bluetooth phone while at either location. Joe's network
administrator must set up the firewall policy to allow UIP traffic
to Joe's work machine, but Joe doesn't have to do anything.

Figure 2: UIP in the Internet Protocol Architecture



1.2 UIP's Role in the Internet 

UIP sits on top of existing address-based network-layer protocols
such as IPv4 and IPv6, and can also operate directly over
link-layer protocols such as Ethernet, 802.11, and Bluetooth (see
Figure 2). Upper-level protocols and applications use UIP in the
same way as they use traditional address-based network-layer
protocols. Instead of addresses, however, upper-level protocols and
applications name and connect with other UIP nodes using
cryptographic identifiers, comparable to Moskowitz's proposed host
identities [21]. Since UIP node identifiers have no relationship to
network topology, nodes can create their own identifiers without
reference to central authorities, and node identifiers remain valid
as long as desired even as the node moves and the surrounding
network topology changes.

This paper focuses purely on UIP's routing and for- 
warding algorithms, leaving other aspects of UIP to be 
developed in future work. For this reason, the exposition 
of the protocol in this paper is high-level and algorithmic 
in nature. The only property of UIP node identifiers that
matters in this paper is that they are relatively
uniformly distributed in a flat namespace.



1.3 Key Properties of UIP 

In contrast with conventional routing algorithms for flat 
namespaces, UIP's routing protocol has only O(log N)
per-node storage and update traffic requirements. UIP 
achieves this scalability by distributing routing informa- 
tion throughout the network in a self-organizing struc- 
ture adapted from the Kademlia distributed hash table 
(DHT) algorithm [18]. Unlike location-independent nam- 
ing services such as DHTs, UIP does not assume that un- 
derlying protocols provide connectivity between any two 
nodes. When address-based routing protocols fail to pro- 
vide direct connectivity for any reason, such as intermit- 
tent glitches, network address translators, or incompatible 
address-based routing technologies, UIP routes around 
these discontinuities by forwarding traffic through other 
UIP nodes. 

The cost of distributing routing information throughout 
the network for scalability is that individual UIP nodes 
rarely have enough information to determine the shortest 
or "best" possible route to another node. In effect, UIP 
does not implement a distributed "all-pairs shortest paths" 
algorithm, as conventional protocols for flat namespaces
do [15]. Instead, UIP attempts the more moderate goal 
of efficiently finding some path whenever one exists, and 
usually finding reasonably short paths. This goal is ap- 
propriate for UIP since the purpose of UIP is to find com- 
munication paths that address-based protocols such as IP 
cannot find at all. 

In general we cannot expect identity-based routing to 
be as efficient as routing protocols that take advantage of 
the locality and aggregation properties of structured ad- 
dresses. UIP is not intended to replace address-based rout- 
ing protocols, but to complement them. By using address- 
based protocols such as IP to move data efficiently across 
the many "short" hops comprising the core Internet in- 
frastructure and other large managed networks, UIP only 
needs to route data across a few "long" hops, resolving
the discontinuities between address domains and
bridging managed core networks to ad hoc edge networks. 
For this reason, it is less important for UIP to find the best 
possible route all the time, and more important for the al- 
gorithm to be scalable, robust, and fully self-managing. 

We explore two specific UIP forwarding mechanisms based on the same
routing protocol. One mechanism guarantees that nodes can operate
in O(log N) space per node on any network topology. The other
forwarding mechanism allows UIP to find somewhat better routes and
still uses O(log N) space on typical networks, but may require O(N)
space on worst-case network topologies. With either forwarding
mechanism, simulations indicate that UIP consistently finds paths
that are on average within 2× the length of the best possible path.
UIP occasionally chooses paths that are much longer than the best
possible path, but these bad paths are rare.

1.4 Road Map 

The rest of this paper is organized as follows. Section 2 
details the routing protocol by which UIP nodes organize 
and find paths to other nodes, and Section 3 describes 
the two alternative mechanisms UIP nodes use to forward 
data between indirectly connected nodes. Section 4 evalu- 
ates the routing and forwarding protocol and demonstrates 
key properties through simulations. Section 5 summarizes 
related work, and Section 6 concludes. 



2 The Routing Protocol 

This section describes the distributed lookup and routing 
structure that enables UIP nodes to locate and communi- 
cate with each other by their topology-independent iden- 
tities. 

2.1 Neighbors and Links 

Each node in a UIP network maintains a neighbor table, 
in which the node records information about all the other 
UIP nodes with which it is actively communicating at a 
given point in time, or with which it has recently commu- 
nicated. The nodes listed in the neighbor table of a node 
A are termed A's neighbors. A neighbor of A is not necessarily
"near" to A in geographic, topological, or node identifier space;
the presence of a neighbor relationship merely reflects ongoing or
recent pairwise communication.

Some neighbor relationships are mandated by the de- 
sign of the UIP protocol itself as described below, while 
other neighbor relationships are initiated by the actions 
of upper-level protocols. For example, a request by an
upper-level protocol on node A to send a packet to some 
other node B effectively initiates a new UIP neighbor re- 
lationship between A and B. These neighbor relation- 
ships may turn out to be either ephemeral or long-term. 
A UIP node's neighbor table is analogous to the table an 
IPv4 or IPv6 host must maintain in order to keep track of 
the current path maximum transmission unit (MTU) and 
other vital information about other endpoints currently or 
recently of interest to upper-level protocols. 

As a part of each entry in a node's neighbor table, the 
node's UIP implementation maintains whatever informa- 
tion it needs to send packets to that particular neighbor. 
This information describes a link between the node and its 
neighbor. A link between two nodes A and B may be either
physical or virtual. A physical link is a link for which
connectivity is provided directly by some underlying pro- 
tocol. For example, if A and B are both well-connected 
nodes on the Internet that can successfully communicate 
via their public IP addresses, then AB is a physical link 
from the perspective of the UIP layer, even though this 
communication path may in reality involve many hops at 
the IP layer and even more hops at the link layer. If a 
physical link is available between A and B, then A and 
B are termed physical neighbors, and each node stores 
the other's IP address or other address information for un- 
derlying protocols in the appropriate entry of its neighbor 
table. 

A virtual link, in contrast, is a link between two nodes 
that can only communicate by forwarding packets through 
one or more intermediaries at the UIP level. We describe 
such nodes as virtual neighbors. The mechanism for UIP- 
layer packet forwarding and the contents of the neigh- 
bor table entries for a node's virtual neighbors will be 
described later in Section 3. For now, however, we will 
simply assume that the following general principle holds. 
Given any two existing physical or virtual links AB and BC with
endpoint B in common, nodes A and C can construct a new virtual
link AC between them by establishing a UIP-level forwarding path
through B. That is, UIP nodes can construct new virtual links
recursively from existing physical and virtual links.

In Figure 3, for example, virtual link AC builds on 
physical links AB and BC, and virtual link AD in turn 
builds on virtual link AC and physical link CD. Once 
these virtual links are set up, node A has nodes B, C, and 
D in its neighbor table, the last two being virtual neigh- 
bors. Node D only has nodes C and A as its neighbors; 
D does not necessarily need to know about B in order to 
use virtual link AC. 
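
The recursive construction in this example can be made concrete with a short sketch. The Python fragment below is illustrative only: the `links` table and the `expand` helper are hypothetical names that do not appear in UIP, and the sketch assumes that each virtual link records the single waypoint through which it was built.

```python
# Links of the Figure 3 example. Keys are unordered node pairs; the value is
# the waypoint a virtual link was built through, or None for a physical link.
links = {
    frozenset("AB"): None,  # physical
    frozenset("BC"): None,  # physical
    frozenset("CD"): None,  # physical
    frozenset("AC"): "B",   # virtual: builds on AB and BC
    frozenset("AD"): "C",   # virtual: builds on AC and CD
}


def expand(src, dst):
    """Return the nodes visited when the link src-dst is unrolled to physical links."""
    waypoint = links[frozenset((src, dst))]
    if waypoint is None:
        return [src, dst]
    first = expand(src, waypoint)
    second = expand(waypoint, dst)
    return first + second[1:]  # drop the duplicated waypoint


print(expand("A", "D"))
```

Unrolling virtual link AD yields the node sequence A, B, C, D, even though the entry for AD itself only names C as its waypoint.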

2.2 Constructing Virtual Links 

UIP nodes construct new virtual links with a single basic
mechanism, represented by the build_link procedure shown in
Figure 4. A node n can only build a virtual link to some other node
n_t if n already has some "waypoint" node n_w in its neighbor
table, and n_w already has n_t in its own neighbor table. Node n
can then use the build_link procedure to construct a link from n to
n_t.

// build a link from node n to target node n_t,
// using node n_w as a waypoint if necessary
n.build_link(n_w, n_t) {
    assert (n and n_w are neighbors)
    assert (n_w and n_t are neighbors)

    try to contact n_t by its IP address, MAC address, etc.
    if direct contact attempt succeeds {
        build physical link from n to n_t
    } else {
        build virtual link from n to n_t via n_w
    }

    assert (n and n_t are neighbors)
}

Figure 4: Pseudocode to Build a Physical or Virtual Link

In the build_link procedure, n first attempts to initiate a direct
connection to n_t via underlying protocols, using any network- or
link-layer address(es) for n_t that n may have learned from n_w.
For example, if n_t is a node with several network interfaces, each
in a different address domain, then n_t might publish both the IP
addresses and the IEEE MAC addresses of all of its network
interfaces, so that other UIP nodes in any of these domains can
initiate direct connections with n_t even if they don't know
exactly which domain they are in. If at least one of these direct
connection attempts succeeds, then n now has n_t as a physical
neighbor, and a virtual link is not necessary.

If all direct connection attempts fail (or do not succeed quickly
enough), however, then n constructs a virtual link to n_t using n_w
as a forwarding waypoint. In this way, the build_link procedure
takes advantage of underlying connectivity for efficiency whenever
possible, but succeeds even when only indirect connectivity is
available.
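
As a minimal sketch of this procedure, the Python fragment below mirrors the steps of Figure 4. It is not an implementation of UIP: the `Node` class, its `reachable` set (a stand-in for whatever direct connectivity the underlying protocols happen to provide), and the tuple format of neighbor table entries are all assumptions made for illustration.

```python
class Node:
    def __init__(self, ident, reachable=()):
        self.ident = ident
        # Assumption: identifiers this node can contact directly through
        # some underlying protocol (IP, Ethernet, Bluetooth, ...).
        self.reachable = set(reachable)
        # Neighbor table: identifier -> ("physical", None) or ("virtual", waypoint)
        self.neighbors = {}

    def build_link(self, waypoint, target):
        """Link self to target, forwarding via waypoint only if necessary."""
        assert waypoint.ident in self.neighbors    # n and n_w are neighbors
        assert target.ident in waypoint.neighbors  # n_w and n_t are neighbors
        if target.ident in self.reachable:         # direct contact succeeds
            self.neighbors[target.ident] = ("physical", None)
        else:                                      # fall back to forwarding
            self.neighbors[target.ident] = ("virtual", waypoint.ident)
        return self.neighbors[target.ident]


# A reaches B directly and B reaches C directly, but A cannot reach C.
a = Node("A", reachable={"B"})
b = Node("B", reachable={"A", "C"})
c = Node("C", reachable={"B"})
a.neighbors["B"] = ("physical", None)
b.neighbors["C"] = ("physical", None)

link_ac = a.build_link(b, c)
print(link_ac)
```

Here A ends up with C as a virtual neighbor via B. If A later gains direct connectivity to C, the same call records a physical link instead.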

2.3 UIP Network Structure 

While virtual links provide a basic forwarding mechanism, UIP nodes
must have an algorithm to determine which virtual links to create
in order to form a communication path between any two nodes. For
this purpose, all connected UIP nodes in a network self-organize
into a distributed structure that allows any node to locate and
build a communication path to any other by resolving the target
node's identifier one bit at a time from left to right. The UIP
network structuring algorithm is closely related to peer-to-peer
distributed hash table (DHT) algorithms such as Pastry [30] and
Kademlia [18]. Unlike DHTs, however, UIP uses this self-organizing
structure not only to look up information such as the IP or MAC
address(es) of a node from its UIP identifier, but also as a basis
for constructing UIP-level forwarding paths between nodes for which
underlying protocols provide no direct connectivity.

For simplicity of exposition we will assume that each node has only
one identifier, each node's identifier is unique, and all
identifiers are generated by the same l-bit hash function. We will
treat UIP node identifiers as opaque l-bit binary bit strings. The
longest common prefix (LCP) of two nodes n_1 and n_2, written
lcp(n_1, n_2), is the longest bit string prefix common to their
respective UIP identifiers. The proximity of two nodes,
prox(n_1, n_2), is the length of lcp(n_1, n_2): the number of
contiguous bits their identifiers have in common starting from the
left. For example, nodes 1011 and 1001 have an LCP of 10 and a
proximity of two, while nodes 1011 and 0011 have an empty LCP and
hence a proximity of zero. Nodes that are "closer" in identifier
space have a higher proximity. Since node identifiers are unique,
0 ≤ prox(n_1, n_2) < l if n_1 ≠ n_2, and prox(n, n) = l.

Each node n divides its neighbor table into l buckets, as
illustrated in Figure 5, and places each of its neighbors n_i into
bucket b_i = prox(n, n_i) corresponding to that neighbor's
proximity to n. This distance metric, also known as the XOR metric
[18], has the important symmetry property that if node n_2 falls
into bucket b of node n_1's neighbor table, then n_1 falls into
bucket b of n_2's neighbor table. This symmetry facilitates the
establishment of pairwise relationships between nodes, and allows
both nodes in such a relationship to benefit from requests flowing
between them in either direction.

Figure 5: Neighbor Tables, Buckets, and Node ID Space

// build a communication path from node n
// to target node n_t
n.build_path(n_t) {
    i = 1
    b_1 = prox(n, n_t)
    n_1 = n.neighbor_table[b_1]
    while (n_i ≠ n_t) {
        b_{i+1} = prox(n_i, n_t)
        assert (b_{i+1} > b_i)

        n_{i+1} = n_i → find_neighbor_in_bucket(b_{i+1})
        if find_neighbor_in_bucket request failed {
            return failure: node n_t does not exist or is not reachable.
        }

        n.build_link(n_i, n_{i+1})
        assert (n_{i+1} is now n's neighbor)

        i = i + 1
    }
    return success: we now have a working link to n_t
}

Figure 6: Pseudocode to Build a Path to Any Node

In order for a UIP network to be fully functional, the 
network must satisfy the following connectivity invariant. 
Each node n perpetually maintains an active connection 
with at least one neighbor in every bucket b, as long as a
reachable node exists anywhere in the network that could 
fit into bucket b. In practice each node attempts to main- 
tain at least k active neighbors in each bucket at all times, 
for some redundancy factor k. 
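
The proximity function and the bucket layout can be illustrated with a few lines of Python. This is a sketch under the stated simplifications (identifiers are equal-length bit strings); the helper names `lcp`, `prox_xor`, and `bucketize` are ours and are introduced only for illustration. The `prox_xor` variant shows why proximity relates to the XOR metric: the proximity equals l minus the bit length of the XOR of the two identifiers.

```python
def lcp(a, b):
    """Longest common prefix of two equal-length bit strings."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return "".join(prefix)


def prox(a, b):
    """Proximity: number of leading bits the two identifiers share."""
    return len(lcp(a, b))


def prox_xor(a, b):
    """Same value computed from the XOR of the identifiers."""
    return len(a) - (int(a, 2) ^ int(b, 2)).bit_length()


def bucketize(n, neighbors):
    """Place each neighbor of n into the bucket given by its proximity to n."""
    table = {}
    for m in neighbors:
        table.setdefault(prox(n, m), []).append(m)
    return table


table = bucketize("1011", ["1001", "0011", "1010"])
print(table)
```

For node 1011, the neighbors 1001, 0011, and 1010 land in buckets 2, 0, and 3 respectively. Because prox is symmetric in its arguments, node 1001 would in turn place 1011 in its own bucket 2.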

2.4 Building Communication Paths 

If the connectivity invariant is maintained throughout a
UIP network, then any node n can communicate with any
target node n_t by the following procedure, outlined in
pseudocode in Figure 6.

Node n first looks in bucket b_1 = prox(n, n_t) of its own neighbor
table. If this bucket is empty, then n_t does not exist or is not
reachable, and the search fails. If the bucket contains n_t itself,
then the target node is already an active neighbor and the search
succeeds. Otherwise, n picks any neighbor n_1 from bucket b_1.
Since n_1's and n_t's proximity to n are both b_1, the first b_1
bits of n_1 and n_t match those of n's identifier, while their
immediately following bits are both opposite that of n. The
proximity of n_1 to n_t is therefore at least b_1 + 1.



Node n now sends a message to n_1 requesting n_1's nearest neighbor
to n_t. Node n_1 looks in bucket b_2 = prox(n_1, n_t) in its
neighbor table, and returns information about at least one such
node, n_2, if any are found. The information returned includes the
UIP identifiers of the nodes found along with any known IP
addresses, IEEE MAC addresses, or other underlying protocol
addresses for those nodes. Node n then uses the build_link
procedure in Figure 4 to establish a connection to n_2, via a
direct physical link if possible, or a virtual link through n_1
otherwise.

Now n_2 is also an active neighbor of n, falling into the same
bucket of n's neighbor table as n_1 but closer in proximity to n_t.
The original node n continues the search iteratively from n_2,
resolving at least one bit per step and building additional
recursive virtual links as needed, until it finds the desired node
or the search fails. If the search eventually succeeds, then n will
have n_t as an active (physical or virtual) neighbor and
communication can proceed.
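
The iterative search can be sketched in Python as follows. This is a simplified, single-process model rather than the protocol itself: the `net` dictionary stands in for message exchange between nodes, `add_neighbor` stands in for build_link and records only the waypoint used, and the `prefer` argument (our assumption) makes a node return the target itself when it already has it as a neighbor.

```python
def prox(a, b):
    """Number of leading bits two equal-length identifiers share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class Node:
    def __init__(self, ident):
        self.ident = ident
        self.table = {}  # bucket index -> list of neighbor identifiers
        self.via = {}    # neighbor identifier -> waypoint (None if linked directly)

    def add_neighbor(self, other, via=None):
        bucket = self.table.setdefault(prox(self.ident, other), [])
        if other not in bucket:
            bucket.append(other)
            self.via[other] = via

    def find_neighbor_in_bucket(self, b, prefer=None):
        bucket = self.table.get(b, [])
        if prefer in bucket:
            return prefer
        return bucket[0] if bucket else None


def build_path(net, n, target):
    """Return the nodes contacted on the way to target, or None on failure."""
    b = prox(n, target)
    cur = net[n].find_neighbor_in_bucket(b, prefer=target)
    if cur is None:
        return None  # target does not exist or is not reachable
    hops = [cur]
    while cur != target:
        b_next = prox(cur, target)
        assert b_next > b  # at least one more bit is resolved per step
        nxt = net[cur].find_neighbor_in_bucket(b_next, prefer=target)
        if nxt is None:
            return None
        net[n].add_neighbor(nxt, via=cur)  # stands in for n.build_link(cur, nxt)
        hops.append(nxt)
        cur, b = nxt, b_next
    return hops


# A chain of 3-bit identifiers in which each node knows only its successor.
net = {i: Node(i) for i in ("000", "100", "110", "111")}
for x, y in (("000", "100"), ("100", "110"), ("110", "111")):
    net[x].add_neighbor(y)
    net[y].add_neighbor(x)

path = build_path(net, "000", "111")
print(path, net["000"].via)
```

In this small network, node 000 reaches 111 by contacting 100, 110, and 111 in turn, and ends up with 110 and 111 as virtual neighbors via 100 and 110 respectively.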

In practice, nodes can improve the robustness and responsiveness
of the build_path procedure by selecting a set of up to k neighbor
nodes at each iteration and making find_neighbor requests to all of
them in parallel, in
lookups. Parallelizing the construction of UIP commu- 
nication paths has the added benefit that the originating 
node is likely to end up having discovered several alter- 
nate paths to the same node. The originating node can 
evaluate these alternative paths using some suitable crite- 
ria and choose the best of them for subsequent communi- 
cation, and keep information about the others stored away 
for use if the primary path fails. The two endpoint nodes 
can even balance their traffic load across these paths if 
they can find reason to believe that the paths are suffi- 
ciently independent for load-balancing to be effective in 
improving overall performance. 

2.5 The Merge Procedure 

The above build_path procedure is much like the lookup procedure
used in the Kademlia DHT, modified to support construction of
indirect forwarding paths between nodes that cannot communicate
directly via underlying protocols. For network construction and
maintenance, however, UIP requires a much more robust algorithm
than those used in Kademlia and other DHTs. DHTs generally assume
not only that underlying protocols provide full any-to-any
connectivity between nodes, but also that nodes join or leave the
network at a limited rate and relatively independently of each
other. In the discontinuous network topologies on which UIP is
intended to run, however, a single broken link can split the
network at arbitrary points, causing the nodes in either partition
to perceive that all the nodes in the other partition have
disappeared en masse. If the network split persists for some time,
the nodes on either side will re-form into two separate networks,
which must somehow be merged again once the networks are
re-connected.

// merge node n into the portion of a network
// reachable from neighbor n_1
n.merge(n_1) {
    i = 1
    b_1 = prox(n, n_1)
    while (b_i < l) {
        for j = 0 thru (b_i - 1) {
            if n.neighbor_table[j] not already full {
                n_j = n_i → find_neighbor_in_bucket(j)
                if find_neighbor_in_bucket request succeeded {
                    n.build_link(n_i, n_j)
                }
            }
        }

        n_{i+1} = n_i → find_neighbor_in_bucket(b_i)
        if find_neighbor_in_bucket request failed
            break
        b_{i+1} = prox(n, n_{i+1})
        assert (b_{i+1} > b_i)

        n.build_link(n_i, n_{i+1})
        i = i + 1
    }
}

Figure 7: Pseudocode to Merge a Node Into a Network

Our algorithm assumes that underlying protocols pro- 
vide some means by which topologically near UIP nodes 
can discover each other and establish physical neighbor 
relationships. For example, UIP nodes might use Ethernet
broadcasts, IPv4 subnet broadcasts, or IPv6 neighbor
discovery to detect nearby neighbors automatically.
Nodes might also contain "hard-coded" IP addresses of 
some well-known UIP nodes on the Internet, so that nodes 
with working Internet connections can quickly merge into 
the public Internet-wide UIP network. Finally, the user 
might in some cases explicitly provide the address infor- 
mation necessary to establish contact with other relevant 
UIP nodes. Whenever a new physical link is established 
by any of the above means, the node on each end of the 
link performs the merge procedure outlined in Figure 7, 
to merge itself into the network reachable from the other 
node. 



The merge process works as follows. Suppose that node n has node
n_1 as a neighbor, falling in bucket b_1 = prox(n, n_1) in its
neighbor table. If b_1 > 0, then n and n_1 have one or more initial
identifier bits in common, and any neighbors of n_1 in buckets 0
through b_1 - 1 are also suitable for the corresponding buckets in
n's neighbor table. Node n therefore requests information from n_1
about at least one of n_1's neighbors in each of these buckets, and
builds a physical or virtual (via n_1) link to that node. Assuming
n_1's neighbor table satisfied the connectivity invariant, n's
neighbor table now does as well for buckets 0 through b_1 - 1.

Node n now asks n_1 for any neighbor from n_1's bucket b_1 other
than n itself, as if n were searching for its own identifier in
n_1's network. If such a node n_2 is found, then its proximity
b_2 = prox(n, n_2) must be at least b_1 + 1. Node n builds a link
to n_2 via n_1, fills any empty buckets j with 0 ≤ j < b_2 from
n_2's neighbor table as above, and then continues the process from
n_2 for neighbors with proximity greater than b_2. Eventually n
reaches some node n_i with proximity b_i, whose bucket b_i contains
no neighbors other than n itself. This means that there are no
other nodes in n_1's network with greater proximity to n than b_i,
and so n has satisfied the connectivity invariant in its own
neighbor table, at least with respect to the portion of the network
reachable from n_1.
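
The merge walk can be sketched in Python along the lines of Figure 7. The sketch below is a single-process model under several assumptions of ours: the `net` dictionary replaces message exchange, `Node.add` stands in for build_link, a bucket counts as full once it holds K entries, and K = 1 is chosen purely for brevity.

```python
K = 1  # redundancy factor k; the value 1 is an assumption for brevity


def prox(a, b):
    """Number of leading bits two equal-length identifiers share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class Node:
    def __init__(self, ident):
        self.ident = ident
        self.table = {}  # bucket index -> list of neighbor identifiers

    def add(self, other):
        bucket = self.table.setdefault(prox(self.ident, other), [])
        if other not in bucket:
            bucket.append(other)

    def full(self, b):
        return len(self.table.get(b, [])) >= K

    def find_neighbor_in_bucket(self, b, exclude=None):
        for m in self.table.get(b, []):
            if m != exclude:
                return m
        return None


def merge(net, n, n1):
    """Fill n's buckets from the network reachable through neighbor n1."""
    node, cur, b = net[n], n1, prox(n, n1)
    while b < len(n):
        for j in range(b):  # buckets 0 .. b_i - 1
            if not node.full(j):
                found = net[cur].find_neighbor_in_bucket(j)
                if found is not None:
                    node.add(found)  # stands in for n.build_link(cur, found)
        nxt = net[cur].find_neighbor_in_bucket(b, exclude=n)
        if nxt is None:
            break  # no node in this network is closer to n
        assert prox(n, nxt) > b
        node.add(nxt)  # stands in for n.build_link(cur, nxt)
        cur, b = nxt, prox(n, nxt)


def link(net, a, b):
    net[a].add(b)
    net[b].add(a)


# Three mutually linked nodes, plus newcomer 000 attached to 100.
net = {i: Node(i) for i in ("100", "010", "001", "000")}
link(net, "100", "010")
link(net, "100", "001")
link(net, "010", "001")
link(net, "000", "100")  # new physical link
merge(net, "000", "100")
print(net["000"].table)
```

Starting from its single neighbor 100 in bucket 0, node 000 walks to 010 (bucket 1) and then to 001 (bucket 2), at which point no closer node remains and the walk stops.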



2.6 Merge Notifications 

After a node n merges into another node n_1's network via the merge
procedure above, however, there may be other nodes in n_1's network
besides the ones that n contacted directly that also need to learn
about n before their neighbor tables will satisfy the connectivity
invariant for the new, larger network. In addition, n may not be
just a "lone" node joining n_1's network, but may instead be a
member of a larger existing network (reachable from n's neighbor
table) that previously split from or evolved independently from
n_1's network. In this case, many nodes in n's network may need to
learn about nodes in n_1's network, and vice versa, before the
connectivity invariant will be re-established globally.

To cause other nodes to update their neighbor tables appropriately,
UIP uses a simple notification mechanism. Whenever a node n makes
contact for any reason with a new physical or virtual neighbor n_n,
and bucket b_n = prox(n, n_n) of n's neighbor table was not full
before the addition of n_n, n sends a message to all of its
existing neighbors notifying them of the new node n_n. In response
to this notification message, each of n's existing neighbors n_i
contacts n_n via n_i.build_link(n, n_n), and then likewise merges
into n_n's network via n_i.merge(n_n). If this process helps n_i to
fill any of its previously underfull neighbor table buckets, then
n_i subsequently sends notifications to its neighbors, and so on.
The chain reaction stops when all of the affected nodes cease
finding new nodes that fit into underfull buckets in their neighbor
tables.

To understand this process, consider two initially separate
UIP networks: a "red" network consisting of i nodes
r_1 ... r_i, and a "green" network consisting of j nodes
g_1 ... g_j. We say that any given node n satisfies the red
connectivity invariant if each bucket in n's neighbor table 
contains at least one red node if any red node exists that 
could fit into that bucket. Similarly, we say that a node 
n satisfies the green connectivity invariant if each of n's 
buckets contains at least one green node if any green node 
exists that could fit into that bucket. We assume that all 
green nodes initially satisfy the green connectivity invari- 
ant, but no green nodes satisfy the red connectivity invari- 
ant because there are initially no connections between the 
red and green networks. Similarly, all red nodes satisfy 
the red connectivity invariant but no red nodes satisfy the 
green connectivity invariant. 

Now suppose that a physical link is somehow established between
nodes r_1 and g_1, connecting the two networks. In response, r_1
performs a merge(g_1), filling any underfull buckets in its
neighbor table that can be filled from green nodes reachable from
g_1, and g_1 likewise performs a merge(r_1) to fill its buckets
from nodes in the red network. Node r_1 effectively locates and
builds links with its nearest (highest-proximity) neighbors in the
green network, and g_1 likewise locates and builds links with its
nearest neighbors in the red network. As a result, after the merge
process r_1 satisfies the green connectivity invariant and g_1
satisfies the red connectivity invariant. Since r_1 and g_1 already
satisfied the red and green invariants, respectively, and adding
new neighbors to a node's neighbor table cannot "un-satisfy" a
previously satisfied connectivity invariant, both r_1 and g_1 now
satisfy the global connectivity invariant covering both red and
green nodes.

Assuming node identifiers are reasonably uniformly distributed,
with high probability one or both of r_1 and g_1 will find one or
more new nodes in the opposite network that fit into previously
underfull buckets. Before the merge, bucket b = prox(r_1, g_1) in
both r_1 and g_1 may already have been full, which is likely if r_1
and g_1 are far apart in identifier space. There may even be no
nodes in the green network that fall into underfull buckets in r_1,
but this event is unlikely unless the green network is much smaller
than the red network. Similarly, there may be no nodes in the red
network that fall into underfull buckets in g_1, but only if the
red network is much smaller than the green network. If the two
networks are similar in size, then both r_1 and g_1 will almost
certainly find new neighbors that fit into underfull buckets.

At any rate, the discovery of new neighbors falling in 
these underfull buckets causes r_1 and/or g_1 to send merge
notifications to their existing neighbors in the red and 
green networks, respectively, supplying a link to the op- 
posite node as a "hint" from which other nodes in each 
network can start their merge processes. Each node in ei- 
ther network that is notified in this way initiates its own 
merge process to fill its neighbor table from nodes in the 
other network, in the process triggering the merge process 
in its other neighbors, eventually leaving all nodes satis- 
fying the global connectivity invariant. 

In practice it is important to ensure that the inevitable 
flurry of merge notifications does not swamp the whole 
network, especially when two relatively large networks 
merge. Standard protocol engineering solutions apply to 
this problem, however, such as rate-limiting the accep- 
tance or spread of notifications, propagating merge no- 
tifications periodically in batches, and keeping a cache in 
each node of recently-seen merge notifications to avoid 
performing the same merge many times in response to 
equivalent merge notifications received from different 
neighbors. 
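
As one illustration of the last of these techniques, the following Python sketch keeps a small cache of recently seen merge notifications, keyed by the identifier of the "hint" node. The class name, the time-to-live, and the size limit are hypothetical choices made for this sketch and are not specified by the protocol.

```python
import time
from collections import OrderedDict

class MergeNotificationCache:
    """Remembers recently seen merge hints so that equivalent notifications
    arriving from different neighbors trigger only one merge (sketch)."""

    def __init__(self, ttl_seconds=60.0, max_entries=1024):
        self.ttl = ttl_seconds          # assumed expiry time for a hint
        self.max_entries = max_entries  # assumed bound on cache size
        self._seen = OrderedDict()      # hint node identifier -> time first seen

    def should_merge(self, hint_id, now=None):
        """Return True only the first time hint_id is seen within the TTL."""
        now = time.monotonic() if now is None else now
        # Drop the oldest entries while they are expired or the cache is too big.
        while self._seen:
            oldest_id, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.ttl and len(self._seen) <= self.max_entries:
                break
            del self._seen[oldest_id]
        if hint_id in self._seen:
            return False
        self._seen[hint_id] = now
        return True
```

A node would consult should_merge before starting a merge in response to a notification, and skip the merge when it returns False.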

3 Packet Forwarding 

The previous section described how UIP nodes form a 
self-organizing structure in which any node can build a 
communication path to any other node by recursively con- 
structing virtual links on top of other links, but did not 
specify exactly how virtual links operate. In this sec- 
tion we explore the construction and maintenance of vir- 
tual links in more detail. We will explore in particular 
two alternative methods for implementing virtual links: 
one based on source routing, the other based on recur- 
sive tunneling. Source routing potentially enables nodes 
to find more efficient routes and keeps the basic forward- 
ing mechanism as simple as possible, while the recursive 
tunneling approach minimizes the amount of state each 
node must maintain in its neighbor table. 

3.1 Source Routing 

With source routing, each entry in a node's neighbor ta- 
ble that represents a virtual neighbor contains a complete 
source route to the target node. The source route lists the 
UIP identifiers of a sequence of nodes, starting with the 
origin node and ending with the target node, such that 
each adjacent pair in the sequence has (or recently had)
a working physical link between them. Of course, since
these links need only be "physical" from the perspective
of the UIP layer, each link in a UIP source route may
represent many hops at the IP routing or link layers.

Figure 8: Source Routing versus Recursive Tunneling

Consider for example Figure 8, in which the five nodes
A, B, C, D, E are connected by a chain of physical links.
Nodes A and C have established a virtual link AC by
building a two-hop source route via their mutual neighbor
B, and nodes C and E have similarly established a virtual
link CE via D. Suppose node A subsequently learns
about E from C and desires to create a virtual link AE
via C. Node A contacts C requesting C's source route to
E, and then appends C's source route for CE (C, D, E)
to A's existing source route for AC (A, B, C), yielding
the complete physical route A, B, C, D, E.
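
The route construction in this example can be sketched in a few lines of Python. The function name join_source_routes and the use of plain lists of node names are assumptions made for illustration only.

```python
def join_source_routes(route_to_waypoint, waypoint_route_to_target):
    """Append the waypoint's route to our own route to the waypoint.

    Both routes are lists of UIP identifiers; the first must end at the
    node where the second begins, and that node is listed only once.
    """
    if route_to_waypoint[-1] != waypoint_route_to_target[0]:
        raise ValueError("routes do not meet at a common waypoint")
    return route_to_waypoint + waypoint_route_to_target[1:]

route_ac = ["A", "B", "C"]  # A's existing source route for AC
route_ce = ["C", "D", "E"]  # C's source route for CE, obtained from C
route_ae = join_source_routes(route_ac, route_ce)  # ["A", "B", "C", "D", "E"]
```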

To send a packet to E, node A includes in the packet's 
UIP header the complete source route for the virtual link 
AE stored in its neighbor table entry for E. Each UIP 
node along the path examines the header to find the 
packet's current position along its path, and bumps this 
position indicator to the next position before forwarding 
the packet to the next UIP node in the path. Forwarding 
by source routing in UIP is thus essentially equivalent to 
source routing in IP [6]. 
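
A minimal sketch of this forwarding step follows. The packet layout (a route list plus an integer position indicator) is an assumed in-memory representation, not the actual UIP header format.

```python
from dataclasses import dataclass

@dataclass
class SourceRoutedPacket:
    route: list      # UIP identifiers from origin to target
    position: int    # index in route of the node currently holding the packet
    payload: bytes

def forward(packet):
    """Bump the position indicator and return the next hop's identifier,
    or None if the packet has already reached the end of its route."""
    if packet.position == len(packet.route) - 1:
        return None
    packet.position += 1
    return packet.route[packet.position]
```

Each node along the path calls forward once; the node that receives None is the target and hands the payload to the upper-layer protocol.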

In theory each node may have to store up to l × k entries
in its neighbor table, where l is the node identifier size and
hence the number of buckets in the neighbor table, and k
is the redundancy factor within each bucket. In practice
only the top log_2 N buckets will be non-empty, where N
is the total number of nodes in the network. With source
route forwarding, neighbor table entries may have to hold
source routes for paths up to N - 1 hops in length, in
the worst-case network topology of N nodes connected
together in one long chain. In this case each node may
require O(N log N) storage. In practical networks these
source routes will of course be much shorter, so this large
worst-case storage requirement may not be a problem.



3.2 Recursive Tunneling 

In contrast with source routing, where each entry in a 
node's neighbor table for a virtual neighbor contains a 
complete, explicit route that depends only on physical 
links, recursive tunneling preserves the abstraction prop- 
erties of neighbor relationships by allowing the forward- 
ing path describing a virtual link to refer to both physical 
and (other) virtual links. As a result, each neighbor table 
entry representing a virtual link only needs to hold two 
UIP identifiers: the identifier of the target node, and the 
identifier of the "waypoint" through which the virtual link 
was constructed. Recursive tunneling therefore guarantees
that each node requires at most O(log N) storage,
since neighbor table entries have constant size.
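
To make the difference between the two storage bounds concrete, the following sketch counts the UIP identifiers a neighbor table might hold under each scheme in the worst-case chain topology. The function names, and the choice to count identifiers rather than bytes, are illustrative assumptions.

```python
import math

def table_entries(n_nodes, k):
    """Entries in a neighbor table: about log2(N) non-empty buckets of k entries."""
    return math.ceil(math.log2(n_nodes)) * k

def source_routing_ids(n_nodes, k):
    """Worst case for source routing: every entry holds a route that names
    all N nodes of a single long chain (N - 1 hops)."""
    return table_entries(n_nodes, k) * n_nodes

def recursive_tunneling_ids(n_nodes, k):
    """Recursive tunneling: every entry holds a target and a waypoint."""
    return table_entries(n_nodes, k) * 2
```

For N = 1024 and k = 3, for instance, the table has 30 entries, which hold up to 30720 identifiers with source routing but only 60 with recursive tunneling.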

In the example in Figure 8, node A has constructed vir- 
tual link AC via B, and node C has constructed virtual 
link CE via D, and as before, A learns about E from C 
and wants to construct a virtual link AE via C. With
recursive tunneling, A does not need to duplicate its route
to C or ask C for information about its route to E in order
to construct its new virtual link to E. Instead, A merely
depends on the knowledge that it already knows how to get
to C, and that C knows how to get to E, and constructs 
a neighbor table entry for E describing the "high-level" 
two-hop forwarding path A, C, E. 

Recursive tunneling has several beneficial properties. 
First, since each neighbor table entry for a virtual neigh- 
bor needs to store only two UIP identifiers, the size of 
each neighbor table entry can be limited to a constant, and 
the size of a node's entire neighbor table depends only on 
the size of UIP identifiers (and hence the number of buck- 
ets), and the number of entries in each bucket. Second, if 
"low-level routes" in the network change, all "higher-level 
routes" that are built on them will immediately use the 
correct, updated information with no information propagation
delays. For example, if node D above goes down,
making the path C, D, E unavailable, but C finds an alternate
route to E, then the virtual link AE will automatically
use this new route without A even having to be aware
that something in C's neighbor table changed. 
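
The following sketch illustrates both properties using the five-node chain of Figure 8. Each neighbor table maps a target to the waypoint through which it is reached, with None marking a physical link; this dictionary layout, the function name, and the replacement node X are assumptions introduced for illustration.

```python
# Neighbor tables: node -> {target: waypoint}; a waypoint of None marks a
# physical link. Each virtual entry stores only the target and its waypoint.
tables = {
    "A": {"B": None, "C": "B", "E": "C"},
    "B": {"A": None, "C": None},
    "C": {"B": None, "D": None, "E": "D"},
    "D": {"C": None, "E": None},
    "E": {"D": None},
}

def physical_hops(tables, src, dst):
    """Expand the link src -> dst into the physical hops a packet traverses."""
    waypoint = tables[src][dst]
    if waypoint is None:
        return [(src, dst)]
    return (physical_hops(tables, src, waypoint)
            + physical_hops(tables, waypoint, dst))

before = physical_hops(tables, "A", "E")

# D goes down and C finds an alternate route to E via a hypothetical node X.
# Only C's table (and X's own) changes; A's entry for E is untouched.
tables["C"]["X"] = None
tables["C"]["E"] = "X"
tables["X"] = {"C": None, "E": None}
after = physical_hops(tables, "A", "E")
```

Here before expands to the hops A-B, B-C, C-D, D-E, while after expands to A-B, B-C, C-X, X-E, even though A's neighbor table is identical in both cases.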

The actual packet forwarding mechanism for recursive 
tunneling is of course slightly more involved than for 
source routing. As illustrated in Figure 9, to send a packet 
to E, node A wraps the packet data in three successive 
headers. First, it prepends a UIP tunneling header describ- 
ing the "second-level" virtual path from A to E via C. 
Only nodes C and E will examine this header. Second, 
A prepends a second UIP tunneling header describing the
"first-level" virtual path from A to C via B. Finally, A 
prepends the appropriate lower-layer protocol's header, 
such as an IP or Ethernet header, necessary to transmit 
the packet via the physical link from A to B. 

When the packet reaches node B, B strips off the 
lower-layer protocol header, and looks in the first-level 
(outer) UIP tunneling header to find the UIP identifier of 
the next hop. B then looks up this identifier in its neighbor 
table, prepends the appropriate (new) lower-layer protocol 
header, and transmits the packet to C. 

When the packet reaches node C, C strips off both the 
lower-layer protocol header and the first-level UIP tunnel- 
ing header (since C was the destination according to that 
header), and examines the second-level tunneling header 
to find the final destination, E. C now looks up E in its 
neighbor table and, finding that E is a first-level virtual 
neighbor, C prepends a new first-level tunneling header 
describing the route from C to E via D. Finally, C 
prepends the lower-layer protocol header for the physi- 
cal link from C to D and forwards the message to D. D 
subsequently forwards the message to E, which finally 
strips off the lower-layer protocol header and both of the 
tunneling headers before interpreting the packet data. 
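
The header handling described above can be traced with a small simulation. In this sketch a tunneling header is modeled as an (origin, destination) pair for one virtual link, the header stack is a list whose last element is the outermost header, and the lower-layer header is implied by the physical next hop. These representations and the function names are assumptions; the real header format is not specified here.

```python
# Neighbor tables for Figure 8: node -> {target: waypoint}; None = physical.
TABLES = {
    "A": {"B": None, "C": "B", "E": "C"},
    "B": {"A": None, "C": None},
    "C": {"B": None, "D": None, "E": "D"},
    "D": {"C": None, "E": None},
    "E": {"D": None},
}

def route_from(tables, node, target, stack):
    """Prepend one tunneling header per virtual link until the next hop is
    a physical neighbor of node; return that physical next hop."""
    while tables[node][target] is not None:
        stack.append((node, target))
        target = tables[node][target]
    return target

def trace_packet(tables, src, dst):
    """Return (sender, receiver, header stack) for every physical hop."""
    stack, trace = [], []
    node, target = src, dst
    while True:
        nxt = route_from(tables, node, target, stack)
        trace.append((node, nxt, list(stack)))
        node = nxt
        # Strip every header whose destination is the node just reached.
        while stack and stack[-1][1] == node:
            stack.pop()
        if not stack:
            return trace  # all headers stripped: packet delivered
        target = stack[-1][1]

trace = trace_packet(TABLES, "A", "E")
```

On the hops A-B and B-C the stack holds the headers (A, E) and (A, C); at C the outer header is stripped and replaced with (C, E), which stays in place over the hops C-D and D-E until E strips both.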

3.3 Path Optimization 

When an upper-layer protocol on one node attempts to
contact some other node via UIP, the build_path procedure
described in Section 2.4 searches the network
structure for the requested node identifier, and in the
process may build one or more virtual links using the
build_link procedure of Section 2.1. The search process
through which these virtual links are constructed is essentially
driven by the distance relationships in UIP identifier
space, which have nothing to do with distance relationships
in the underlying physical topology.

Each UIP node has complete flexibility, however, in the
way it chooses the k nodes to fill a particular bucket in
its neighbor table whenever there are more than k nodes
in the network that could fit into that bucket. If the network
contains N nodes with uniformly distributed identifiers,
then we expect nodes to have some flexibility in their
choice of neighbors throughout the first log_2 N - log_2 k
buckets. Further, we naturally expect nodes to select the
"best" k nodes they find for each such bucket: either the
closest in terms of physical topology (UIP hop count), or
the best according to some other pragmatic measure involving
latency, bandwidth, and/or reliability, for example.
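
As a rough sketch of this selection policy, the code below keeps the k lowest-cost candidates for a bucket and estimates the number of buckets in which a node has a choice. The function names and the example hop counts are hypothetical; any cost measure (latency, bandwidth, reliability) could be substituted for hop count.

```python
import heapq
import math

def fill_bucket(candidates, k, cost):
    """Keep the k candidates with the lowest cost, for example UIP hop count."""
    return heapq.nsmallest(k, candidates, key=cost)

def flexible_buckets(n_nodes, k):
    """Expected number of buckets with more than k candidates:
    log2(N) - log2(k), assuming uniformly distributed identifiers."""
    return math.log2(n_nodes) - math.log2(k)

# Hypothetical hop counts to five nodes that all fit into the same bucket.
hop_count = {"n1": 4, "n2": 1, "n3": 7, "n4": 2, "n5": 3}
bucket = fill_bucket(hop_count, k=3, cost=hop_count.get)  # ["n2", "n4", "n5"]
```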

In general, therefore, we expect the first few iterations
of the build_path process to stay within the node's immediate
topological vicinity, with subsequent hops covering
larger topological distances as the remaining distance
in identifier space is progressively narrowed. While
the first few build_path hops will depend only on physical
or inexpensive "low-order" virtual links, the last few
hops might each depend on an expensive "high-order" virtual
link, eventually resulting in a communication path
that crisscrosses throughout the network in a highly
non-optimal fashion. It is therefore important that we find a
way to optimize the routes produced using this process.



The most basic path optimization is inherent in the
build_link procedure. If a node A locates target node
B via the build_path process, but A subsequently finds
that it can contact B directly using underlying protocols
such as IP, using address information it discovers during
the process, then build_link will "short-circuit" the path
from A to B with a physical link requiring no UIP-level
forwarding.

A second important path optimization is for nodes to 
check for obvious redundancies in the routes produced as 
higher-order virtual links are built from lower-order vir- 
tual links. Source routing makes this type of path opti- 
mization easier, since each node has information about 
the complete physical route to each neighbor in its neigh- 
bor table, but we will explore a more limited form of path 
optimization as well that works with recursive tunneling. 
Other, more sophisticated forms of path optimization
are certainly possible and desirable, such as optimizations
relying on a deeper analysis of the relationships between
known neighbors, or based on additional information
exchanged between neighbors beyond the minimal information
required to maintain the network and build virtual
links. We leave more advanced path optimizations for future
work, however, and focus for now on the effects of
simple optimizations that rely on strictly local information.
Figure 10: Path optimization opportunities on different
topologies, when A builds a virtual link to F via D. 



3.3.1 Source Route Optimization

In UIP forwarding by source routing, we optimize source 
routes when combining two shorter paths into a longer one 
simply by checking for nodes that appear in both shorter 
paths. For example, in Figure 10(a), suppose node A has 
established a virtual link AD via B with path A, B, C, D, 
by building on virtual link BD with path B, C, D. A virtual
link also exists between D and F. A now learns about
F through D and attempts to create a virtual link AF via 
D. Without path optimization, the resulting path will be 
A, B, C, D, C, B, F. The path can be trivially shortened 
to the optimal A, B, F, however, simply by noting that 
B appears twice and eliminating the redundant hops be- 
tween them. 

The same optimization shortens the path from A to F 
in Figure 10(b) from A, B, C, D, C, E, F to the optimal 
A, B, C, E, F. This path optimization does not help in 
the case of Figure 10(c), however, since A does not nec- 
essarily know that B and E are direct neighbors. 
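
The optimization for Figures 10(a) and 10(b) amounts to removing loops from a list of node identifiers. A minimal sketch follows, with optimize_source_route as an assumed name.

```python
def optimize_source_route(route):
    """Cut out the hops between repeated occurrences of the same node."""
    optimized = []
    index_of = {}  # node -> its position in optimized
    for node in route:
        if node in index_of:
            # Node seen before: discard everything after its first occurrence.
            cut = index_of[node]
            for dropped in optimized[cut + 1:]:
                del index_of[dropped]
            del optimized[cut + 1:]
        else:
            index_of[node] = len(optimized)
            optimized.append(node)
    return optimized

path_a = optimize_source_route(["A", "B", "C", "D", "C", "B", "F"])
path_b = optimize_source_route(["A", "B", "C", "D", "C", "E", "F"])
```

Here path_a is A, B, F and path_b is A, B, C, E, F. A route in which no node appears twice is returned unchanged, which is why this technique alone cannot exploit a direct link that appears in neither of the combined routes.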

3.3.2 Recursive Tunnel Optimization 

Path optimization is not as easy in forwarding by recursive 
tunnels, because the information needed to perform the 
optimization is more spread out through the network. For 
example, in Figure 10(a), node A knows that the first hop 
along virtual link AD is the physical link AB, but A does 
not necessarily know what type of link BD is and may 
not even know that node C exists. 

In general, for any virtual link from n to n_1 via n_2, node
n also contains in its neighbor table a virtual or physical
link representing the first hop from n to n_2. If the lower-order
link from n to n_2 is a virtual link via some node
n_3, then n also contains in its neighbor table a physical or
virtual link from n to n_3, and so on. We call this chain of
intermediate nodes along the path from n to n_1 that n
inherently knows about n's first hop chain for n_1. For example,
A's first hop chain for D in Figure 10(a) is A, B, D,
whereas D's first hop chain for A is D, C, B, A.

To implement path optimization for recursive tunnels,
we extend the build_link procedure of Section 2.1 so that
when a node n attempts to build a new virtual link to n_t
via waypoint node n_w, n contacts its existing neighbor n_w
requesting n_w's first hop chain for n_t. Node n then
compares the information returned against its own first hop
chain for n_w, and short-circuits any redundant path
elements.

For example, in Figure 10(a), node A is building a vir- 
tual link to F via D, so A requests D's first hop chain 
to F, which is D, C, B, F. A compares this chain with 
its first hop chain for D, which is A, B, D, discovering 
redundant node B and shortening the path to A, B, F. 

This form of path optimization does not help in Fig- 
ure 10(b), however, where the redundant path component 
between C and D is hidden from A because C is not in 
A's first hop chain. Similarly, this optimization does not 
handle Figure 10(c) for the same reason that the source 
routing optimization above fails. 
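
One way to sketch the first hop chain comparison is shown below. The function name and the policy of cutting at the last node of the waypoint's chain that also appears in the local chain are assumptions of this sketch; the extended procedure may choose differently. The second example uses a made-up chain with no shared node other than the waypoint.

```python
def short_circuit(own_chain, waypoint_chain):
    """Combine two first hop chains, skipping redundant path elements.

    own_chain:      n's first hop chain for the waypoint (n, ..., n_w)
    waypoint_chain: n_w's first hop chain for the target (n_w, ..., n_t)
    """
    position = {node: i for i, node in enumerate(own_chain)}
    # Scan the waypoint's chain from the target backwards for a node that
    # n already knows how to reach. The waypoint itself always matches.
    for j in range(len(waypoint_chain) - 1, -1, -1):
        node = waypoint_chain[j]
        if node in position:
            return own_chain[:position[node] + 1] + waypoint_chain[j + 1:]
    raise ValueError("chains do not share the waypoint")

# Figure 10(a): A's chain for D and D's chain for F share node B.
path = short_circuit(["A", "B", "D"], ["D", "C", "B", "F"])  # A, B, F

# Hypothetical chains sharing only the waypoint W: nothing can be skipped.
unchanged = short_circuit(["P", "Q", "W"], ["W", "X", "Y"])
```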



4 Protocol Evaluation 

In this section we use simulation results to evaluate the be- 
havior of UIP's routing and forwarding protocol. A "real- 
world" implementation of the protocol is under develop- 
ment, but until an implementation has been deployed and 
a substantial critical mass of users has developed, sim- 
ulations provide the only realistic option for tuning the 
protocol and predicting how it will behave on the large 
networks it is intended to support. 

4.1 Performance Metrics 

In order to assess the basic viability of the UIP routing
protocol, we focus here on measuring the efficiency of the
network paths the protocol finds through random network 
topologies. Many other important factors that will affect 
the performance of real-world UIP networks remain for 
future study. In particular, while our simulations confirm 
that the protocol recovers from node failures and network 
partitions, we do not yet have a full characterization of 
the dynamic behavior of a UIP network under continuous 
change. 

In order to measure the efficiency of routing paths chosen
by UIP nodes, we define the UIP path length between
two nodes n_1 and n_2 to be the total number of physical
hops in the path that n_1 constructs to n_2 using the
build_path procedure in Figure 6. We define the stretch
between n_1 and n_2 to be the ratio of the UIP path length to
the length of the best possible path through the underlying
topology.

We measure the stretch for a given pair of nodes by
using build_path to construct a path from one node to
the other, measuring the total number of physical hops
in the path, and then eliminating all the virtual links that
build_path constructed so that the measurement of one
path does not affect the measurement of subsequent paths.
On networks of 100 nodes or fewer we measure all possible
paths between any two nodes; on larger networks we take
a sample of 10,000 randomly chosen node pairs.
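
The stretch metric itself is straightforward to compute once a UIP path is known. The sketch below finds the best possible path length with a breadth-first search over the physical topology; the adjacency-list format and the small example topology are assumptions for illustration, and the sketch is not the simulator used for the measurements reported here.

```python
from collections import deque

def shortest_hops(adjacency, src, dst):
    """Breadth-first search for the minimum physical hop count."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    raise ValueError("destination unreachable")

def stretch(adjacency, uip_path):
    """Ratio of the UIP path length (in physical hops) to the best path."""
    uip_length = len(uip_path) - 1
    return uip_length / shortest_hops(adjacency, uip_path[0], uip_path[-1])

# Hypothetical topology: chain A-B-C-D with F attached to B.
adjacency = {"A": ["B"], "B": ["A", "C", "F"], "C": ["B", "D"],
             "D": ["C"], "F": ["B"]}
s = stretch(adjacency, ["A", "B", "C", "D", "C", "B", "F"])  # 6 hops vs 2
```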

4.2 Test Network Topology 

Selecting appropriate network topologies for simulations 
of UIP is difficult, because we have no way to predict the 
topologies of the networks on which a protocol like UIP 
will actually be deployed. Using topological maps of the 
existing IPv4 Internet would not make sense: the exist- 
ing well-connected Internet is precisely the portion of to- 
day's global network infrastructure across which UIP will 
not have to find paths, because IP already does that well 
enough, and UIP simply treats these paths as direct physical
links. For the function UIP is designed to provide, finding
paths between nodes on the Internet and nodes on the
many private and ad hoc networks attached to it, no reli- 
able topological data is available precisely because most 
of these adjoining networks are private. 

Nevertheless, we can construct artificial topologies that 
approximate the most important characteristics we believe 
this global network infrastructure to have. First, we expect 
the topology on which UIP is deployed to consist of many 
clusters, in which each node in a given cluster can reliably 
address and connect with any other node in the same clus- 
ter, but nodes in one cluster have very limited connectivity 
to nodes in other clusters. Second, because of the diver- 
sity of existing networking technologies and deployment 
scenarios, we expect the size of these clusters to follow a 
power law distribution, with larger clusters having better 
connectivity to neighboring clusters. Finally, we expect 
all of these clusters to be within at most a few hops from 
a single huge, central cluster, namely the public IP-based 
Internet. 

To construct an artificial topology having these char- 
acteristics, we start with a single distinguished cluster we 
will call the root cluster, initially containing a single node. 
We then randomly "grow" the network one node at a time 
as follows. For each new node, we choose the number 
of attachment points the node will have based on a geometric
random variable with a multihoming probability
parameter p_m. Approximately p_m N of the network's N
nodes will have at least two attachment points, p_m^2 N have
at least three attachment points, and so on.

Figure 11: Network path stretch for source routing versus
recursive tunneling

We choose each attachment point for a new node via
a random walk from the root cluster using a downstream
probability parameter p_d and a new cluster probability
parameter p_n. At each step, with probability p_d we move the
attachment point downstream, and with probability 1 - p_d
we terminate the process. To move the attachment point
downstream, we choose a node at random from the current
cluster, then we either create a new cluster "private"
to that node with probability p_n, or else with probability
1 - p_n we pick at random any cluster that node
is attached to (which could be the cluster we just came
from). Once the random walk terminates, we add the new
node to the cluster at which the walk ended.
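
The growth procedure can be sketched as follows. This is an illustrative reading of the description above, not the generator used for our experiments: the function name, the data structures, the seeded random source, and the decision to ignore a second attachment to a cluster the node already belongs to are all assumptions.

```python
import random

def grow_rooted_topology(n_nodes, p_m=0.1, p_d=0.75, p_n=0.5, seed=0):
    """Grow a random rooted topology one node at a time (sketch)."""
    rng = random.Random(seed)
    clusters = [[0]]        # clusters[c] = member node ids; cluster 0 is the root
    memberships = {0: [0]}  # node id -> ids of the clusters it is attached to
    for node in range(1, n_nodes):
        memberships[node] = []
        # Geometric number of attachment points: at least two with
        # probability p_m, at least three with probability p_m**2, etc.
        attachments = 1
        while rng.random() < p_m:
            attachments += 1
        for _ in range(attachments):
            current = 0  # every random walk starts at the root cluster
            while rng.random() < p_d:  # move downstream with probability p_d
                via = rng.choice(clusters[current])
                if rng.random() < p_n:
                    # Create a new cluster "private" to the chosen node.
                    clusters.append([via])
                    current = len(clusters) - 1
                    memberships[via].append(current)
                else:
                    # Follow the chosen node into any cluster it belongs to.
                    current = rng.choice(memberships[via])
            if current not in memberships[node]:  # assumption: no duplicates
                clusters[current].append(node)
                memberships[node].append(current)
    return clusters, memberships
```

Starting each walk at a cluster chosen uniformly at random from the existing clusters, instead of at cluster 0, would remove the dependence on a distinguished root cluster.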

We call the resulting random network topology a rooted 
topology, since it consists of many small clusters cen- 
tered around the single large root cluster, approximating 
the well-connected IP-based Internet surrounded by many 
smaller private networks. 

We choose somewhat arbitrarily the following "baseline"
parameters for our experiments. We use network
topologies of varying sizes constructed with a multihoming
probability p_m = 1/10, a downstream probability
p_d = 3/4, and new cluster probability p_n = 1/2. On these
topologies we build UIP networks with a redundancy factor
k = 3, by adding nodes to the network one at a time in
random order. We will vary these parameters to explore 
their impact on the efficiency of the routing protocol. 



4.3 Source Routing versus Recursive Tunneling

In Figure 11 we measure the average and maximum path
stretch observed (vertical axis) between any two nodes on 
networks of a given size (horizontal axis), for both source 
routing and recursive tunneling. The error bars indicate 
standard deviation of the measured stretch. In the ran- 
dom 10,000-node rooted topology, the root cluster con- 
tains 3233 nodes (32% of the network), the average dis- 
tance between any two nodes is 2.5 hops, and the maxi- 
mum distance between any two nodes (total network di- 
ameter) is 8 hops. 

With both source routing and recursive tunneling, we 
see that the UIP routing protocol consistently finds paths 
that are on average no more than twice as long as the best 
possible path. The average-case efficiency of recursive 
tunneling is slightly worse than for source routing, due 
to the more limited amount of information nodes have to 
optimize paths they find through the network. The routing
protocol occasionally chooses very bad paths (up to
6x stretch for source routing and up to 16x for recursive
tunneling), but the low standard deviation indicates that
these bad paths occur very rarely.

4.4 Rooted versus Unrooted Networks 

We would next like to determine how much the UIP rout- 
ing protocol benefits from the tree-like structure of rooted 
network topologies. Is the UIP routing protocol only vi- 
able when some underlying protocol such as IP is doing 
most of the work of routing within the large central clus- 
ter, or could UIP routing also be used to internetwork 
a number of small link-layer networks joined in ad-hoc 
fashion? 

To explore this question, we modify the random net- 
work creation procedure of Section 4.2 so that the random 
walk to find each new attachment point for a given node starts
at a cluster chosen uniformly at random from all exist- 
ing clusters, rather than at a well-known root cluster. The 
resulting unrooted topologies have a much more uniform 
and unpolarized distribution in their cluster sizes and in 
the connections between clusters. In the random 10,000- 
node unrooted topology, for example, the largest cluster 
contains only 11 nodes, the average distance between any
two nodes is 7.7 hops, and the network diameter is 19 
hops. We expect efficient routing on such a diffuse net- 
work to be more difficult than on a rooted network. 

Figure 12 compares the path efficiency of UIP source 
route-based forwarding on rooted and unrooted networks 
of varying sizes. We find that unrooted networks indeed 
yield greater stretch, but not by a particularly wide margin.
This result suggests that UIP routing is not highly dependent
on rooted topologies and may be usable as well
on more diffuse topologies. 

5 Related Work 

In this section we first compare the UIP routing and for- 
warding protocol with existing routing algorithms for both 
wired and ad hoc networks, then we relate UIP to location- 
independent name services and other systems with similar 
features. 

5.1 Routing Algorithms 

In classic distance-vector algorithms [15] such as 
RIP [12], as well as variants such as MS [19], WRP [23], 
and DSDV [26], each router continuously maintains rout- 
ing information about every other addressable node or 
subnet. With these protocols, each router requires at least 
O(N) storage for a network of size N, and must reg- 
ularly exchange connectivity information of size O(N) 
with each of its neighbors. 

In link-state algorithms such as OSPF [22] and 
FSR [24], routers maintain complete network connectiv- 
ity maps. This approach can achieve faster routing ta- 
ble convergence and avoid the looping problems of ba- 
sic distance-vector algorithms, at the cost of even greater 
storage requirements and maintenance overhead. 

Reactive or "on demand" routing algorithms designed 
for ad hoc networks, such as DSR [14] and AODV [25], 
require routers to store information only about currently 
active routes, limiting maintenance traffic and storage 
overheads on networks with localized traffic patterns. 



Routing queries for distant nodes may have to be broad- 
cast through most of the network before the desired route 
is found, however, limiting the scalability of these proto- 
cols on networks with global traffic patterns. 

Landmark [33], and related hierarchical protocols such 
as LANMAR [9], L+ [3], and PeerNet [8], dynamically 
arrange mobile nodes into a tree. The routing protocol 
assigns each node a hierarchical address corresponding to 
its current location in this tree, and implements a location-independent
identity-based lookup service by which the
current address of any node can be found. Each non-leaf 
node serves as a landmark for all of its children, and is re- 
sponsible for routing traffic to them from nodes outside its 
local subtree. Landmark routes local traffic purely within 
the lowest levels of the tree, providing scalability when 
traffic patterns are predominantly local. Since global traf- 
fic must pass through the landmark nodes at the upper lev- 
els of the hierarchy, however, these upper-level nodes are 
easily overloaded in a network with global traffic patterns. 

5.2 Location-Independent Name Services

Naming services such as the Internet's domain name sys- 
tem (DNS) [20] can translate location-independent node 
names on demand to location-specific addresses. Name 
services inherently assume, however, that each node has 
some globally unique address at which it can be reached 
from all other nodes. If a desired node is on a private IP 
network behind a network address translator, for exam- 
ple, then there is generally no IP address by which it can 
be reached from outside the network, and name services 
do not help. Name-based routing [10] can bridge multiple 
IP address domains using DNS names, but its dependence 
on the centrally-administered DNS namespace makes it 
unsuitable for ad hoc networks. 

Recent distributed hash table (DHT) algorithms such 
as Pastry [30], Chord [5], and Kademlia [18], implement 
fully decentralized, self-organizing name services that do 
not depend on top-down, hierarchical administration as 
DNS and other traditional name services do. The UIP 
routing protocol uses a self-organizing network structure 
closely related to the Kademlia DHT. Like conventional 
name services, however, DHT algorithms do not provide 
network-layer routing functionality. Although the process 
of looking up an item in a DHT is sometimes called "routing"
because it involves iteratively contacting a sequence
of nodes that are progressively closer to the desired item 
in identifier space, this process still assumes that the node 
initiating the lookup can directly contact each of the nodes 
in the sequence using underlying protocols. UIP's virtual 
link abstraction described in Section 2, and the forwarding
mechanisms described in Section 3, provide the fundamental
new functionality required to turn the Kademlia
DHT into a network-layer routing protocol. 

The Internet Indirection Infrastructure (i3) [32] uses 
a similar peer-to-peer search structure to implement 
location-independent naming and communication with 
multicast and anycast support. i3 provides special-case
support for forwarding traffic to hosts behind firewalls, 
but it depends on all participating hosts being connected 
to the Internet at all times and does not implement general 
network-layer routing functionality. 

5.3 Other Systems 

A resilient overlay network (RON) [1] serves a function
similar in spirit to UIP, increasing the reliability of an IP 
network by detecting connectivity failures in the under- 
lying network and forwarding traffic around them. RON 
makes no attempt at scalability beyond a few dozen nodes, 
however, and assumes that all participating nodes have 
unique IP addresses. 

Several protocols have been developed to provide 
connectivity through firewalls and NATs, such as 
SOCKS [17], STUN [29], and UPnP [34]. These special-purpose
protocols are tied to the characteristics and network
topologies of commonly deployed NATs and firewalls,
however, and do not solve the more general problem
of routing between different address domains connected
in arbitrary fashion.



6 Conclusion 

Today's global network infrastructure has grown in size 
and diversity beyond the reach of any single address- 
based internetworking protocol. IPv4 and IPv6 have the 
scalability necessary to route between millions or billions 
of nodes, but their centrally-administered hierarchical ad- 
dress domains make edge networks dependent on either 
tedious manual address assignment or continual connec- 
tivity to address services such as DHCP. Existing ad hoc 
networking protocols are fully self-configuring, but they 
do not have the scalability of IP. 

UIP, a scalable identity-based routing protocol, stitches 
together multiple address-based routing domains into a 
single flat namespace, enabling any-to-any communi- 
cation via location-independent node identifiers. With 
identity-based routing, participating nodes in private IP 
address domains and ad hoc edge networks become uni- 
formly accessible from anywhere while connected to the 
global Internet. Even while disconnected from the global
Internet, however, with UIP these edge networks remain
functional and maximally interconnected.

Simulation-based experiments with UIP indicate that 
its routing and forwarding protocol is practical and scal- 
able. Although UIP nodes do not have enough informa- 
tion to choose the best possible routes to other nodes, the 
routes chosen by UIP nodes are on average no more than 
twice as long as the optimal route. These preliminary re- 
sults suggest that identity-based routing on Internet-scale 
networks may indeed be viable. 